144 research outputs found

    Development of mathematical methods for modeling biological systems

    The out-of-sample $R^2$: estimation and inference

    Out-of-sample prediction is the acid test of predictive models, yet an independent test dataset is often not available for assessment of the prediction error. For this reason, out-of-sample performance is commonly estimated using data splitting algorithms such as cross-validation or the bootstrap. For quantitative outcomes, the ratio of variance explained to total variance can be summarized by the coefficient of determination or in-sample $R^2$, which is easy to interpret and to compare across different outcome variables. As opposed to the in-sample $R^2$, the out-of-sample $R^2$ has not been well defined and the variability of the out-of-sample $\hat{R}^2$ has been largely ignored. Usually only its point estimate is reported, hampering formal comparison of predictability of different outcome variables. Here we explicitly define the out-of-sample $R^2$ as a comparison of two predictive models, provide an unbiased estimator and exploit recent theoretical advances on uncertainty of data splitting estimates to provide a standard error for the $\hat{R}^2$. The performance of the estimators for the $R^2$ and its standard error is investigated in a simulation study. We demonstrate our new method by constructing confidence intervals and comparing models for prediction of quantitative Brassica napus and Zea mays phenotypes based on gene expression data.
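
The idea of an out-of-sample $R^2$ as a comparison of two predictive models can be illustrated with a minimal sketch. It assumes, without confirmation from the abstract, that the reference model is a mean-only predictor fitted on the training folds and that the fitted model is ordinary least squares; the function name `cv_out_of_sample_r2` and its parameters are hypothetical. It computes only a plain cross-validated point estimate, not the paper's unbiased estimator or its standard error.

```python
import numpy as np


def cv_out_of_sample_r2(X, y, n_folds=5, seed=0):
    """Cross-validated point estimate of an out-of-sample R^2.

    Assumption (not taken from the paper): R^2 is computed as
    1 - SSE(model) / SSE(null), where the null model predicts the
    training-fold mean and the model is OLS with an intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)

    sse_model = 0.0  # squared test error of the fitted model
    sse_null = 0.0   # squared test error of the mean-only model
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        X_tr, y_tr = X[train_mask], y[train_mask]
        X_te, y_te = X[test_idx], y[test_idx]

        # Fit OLS with an intercept on the training fold only.
        A_tr = np.column_stack([np.ones(len(y_tr)), X_tr])
        coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
        A_te = np.column_stack([np.ones(len(y_te)), X_te])

        sse_model += np.sum((y_te - A_te @ coef) ** 2)
        # The null prediction also uses training data only.
        sse_null += np.sum((y_te - y_tr.mean()) ** 2)

    return 1.0 - sse_model / sse_null
```

Because both models are evaluated on held-out observations, the estimate can be negative when the fitted model predicts worse than the training mean, which cannot happen for the in-sample $R^2$ of a least-squares fit with an intercept.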

    Endoreplication as a potential driver of cell wall modifications

    Endoreplication represents a variant of the mitotic cell cycle during which cells replicate their DNA without mitosis and/or cytokinesis, resulting in an increase in the cells’ ploidy level. This process is especially prominent in higher plants, where it has been correlated with cell differentiation, metabolic output and rapid cell growth. However, different reports argue against a ploidy-dependent contribution to cell growth. Here, we review accumulating data suggesting that endocycle onset might exert an effect on cell growth through transcriptional control of cell wall-modifying genes to drive cell wall changes required to accommodate turgor-driven rapid cell expansion, consistent with the idea that vacuolar expansion rather than a ploidy-driven increase in cellular volume represents the major force driving cell growth.

    Gene duplicability of core genes is highly consistent across all angiosperms

    Gene duplication is an important mechanism for adding to genomic novelty. Hence, which genes undergo duplication and are preserved following duplication is an important question. It has been observed that gene duplicability, or the ability of genes to be retained following duplication, is a nonrandom process, with certain genes being more amenable to survive duplication events than others. Primarily, gene essentiality and the type of duplication (small-scale versus large-scale) have been shown in different species to influence the (long-term) survival of novel genes. However, an overarching view of "gene duplicability" is lacking, mainly due to the fact that previous studies usually focused on individual species and did not account for the influence of genomic context and the time of duplication. Here, we present a large-scale study in which we investigated duplicate retention for 9178 gene families shared between 37 flowering plant species, referred to as angiosperm core gene families. For most gene families, we observe a strikingly consistent pattern of gene duplicability across species, with gene families being either primarily single-copy or multicopy in all species. An intermediate class contains gene families that are often retained in duplicate for periods extending to tens of millions of years after whole-genome duplication, but ultimately appear to be largely restored to singleton status, suggesting that these genes may be dosage balance sensitive. The distinction between single-copy and multicopy gene families is reflected in their functional annotation, with single-copy genes being mainly involved in the maintenance of genome stability and organelle function and multicopy genes in signaling, transport, and metabolism. The intermediate class was overrepresented in regulatory genes, further suggesting that these represent putative dosage-balance-sensitive genes.

    Evolutionary context improves regulatory network predictions

    A novel algorithm harnesses phylogenetic information and facilitates a better understanding of the evolutionary divergence of gene regulation between species.

    Analysis of 41 plant genomes supports a wave of successful genome duplications in association with the Cretaceous-Paleogene boundary

    Ancient whole-genome duplications (WGDs), also referred to as paleopolyploidizations, have been reported in most evolutionary lineages. Their attributed role remains a major topic of discussion, ranging from an evolutionary dead end to a road toward evolutionary success, with evidence supporting both fates. Previously, based on dating WGDs in a limited number of plant species, we found a clustering of angiosperm paleopolyploidizations around the Cretaceous-Paleogene (K-Pg) extinction event about 66 million years ago. Here we revisit this finding, which has proven controversial, by combining genome sequence information for many more plant lineages and using more sophisticated analyses. We include 38 full genome sequences and three transcriptome assemblies in a Bayesian evolutionary analysis framework that incorporates uncorrelated relaxed clock methods and fossil uncertainty. In accordance with earlier findings, we demonstrate a strongly nonrandom pattern of genome duplications over time with many WGDs clustering around the K-Pg boundary. We interpret these results in the context of recent studies on invasive polyploid plant species, and suggest that polyploid establishment is promoted during times of environmental stress. We argue that considering the evolutionary potential of polyploids in light of the environmental and ecological conditions present around the time of polyploidization could mitigate the stark contrast in the proposed evolutionary fates of polyploids.

    Correlation analysis of the transcriptome of growing leaves with mature leaf parameters in a maize RIL population

    Background: To sustain the global requirements for food and renewable resources, unraveling the molecular networks underlying plant growth is becoming pivotal. Although several approaches to identify genes and networks involved in final organ size have proven successful, our understanding remains fragmentary. Results: Here, we assessed variation in 103 lines of the Zea mays B73xH99 RIL population for a set of final leaf size and whole shoot traits at the seedling stage, complemented with measurements capturing growth dynamics, and cellular measurements. Most traits correlated well with the size of the division zone, implying that the molecular basis of final leaf size is already defined in dividing cells of growing leaves. Therefore, we searched for association between the transcriptional variation in dividing cells of the growing leaf and final leaf size and seedling biomass, allowing us to identify genes and processes correlated with the specific traits. A number of these genes have a known function in leaf development. Additionally, we illustrated that two independent mechanisms contribute to final leaf size: maximal growth rate and the duration of growth. Conclusions: Untangling complex traits such as leaf size by applying in-depth phenotyping allows us to define the relative contributions of the components and their mutual associations, facilitating dissection of the biological processes and regulatory networks underneath.

    Nonrandom divergence of gene expression following gene and genome duplications in the flowering plant Arabidopsis thaliana

    BACKGROUND: Genome analyses have revealed that gene duplication in plants is rampant. Furthermore, many of the duplicated genes seem to have been created through ancient genome-wide duplication events. Recently, we have shown that gene loss is strikingly different for large- and small-scale duplication events and highly biased towards the functional class to which a gene belongs. Here, we study the expression divergence of genes that were created during large- and small-scale gene duplication events by means of microarray data and investigate both the influence of the origin (mode of duplication) and the function of the duplicated genes on expression divergence. RESULTS: Duplicates that have been created by large-scale duplication events and that can still be found in duplicated segments have expression patterns that are more correlated than those that were created by small-scale duplications or those that no longer lie in duplicated segments. Moreover, the former tend to have highly redundant or overlapping expression patterns and are mostly expressed in the same tissues, while the latter show asymmetric divergence. In addition, a strong bias in divergence of gene expression was observed towards gene function and the biological process genes are involved in. CONCLUSION: By using microarray expression data for Arabidopsis thaliana, we show that the mode of duplication, the function of the genes involved, and the time since duplication play important roles in the divergence of gene expression and, therefore, in the functional divergence of genes after duplication.

    Origins, evolution, domestication and diversity of Saccharomyces beer yeasts

    Yeasts have been used for food and beverage fermentations for thousands of years. Today, numerous different strains are available for each specific fermentation process. However, the nature and extent of the phenotypic and genetic diversity and specific adaptations to industrial niches have only begun to be elucidated recently. In Saccharomyces, domestication is most pronounced in beer strains, likely because they continuously live in their industrial niche, allowing only limited genetic admixture with wild stocks and minimal contact with natural environments. As a result, beer yeast genomes show complex patterns of domestication and divergence, making both ale (S. cerevisiae) and lager (S. pastorianus) producing strains ideal models to study domestication and, more generally, genetic mechanisms underlying swift adaptation to new niches.

    Expansive evolution of the TREHALOSE-6-PHOSPHATE PHOSPHATASE gene family in Arabidopsis

    Trehalose is a nonreducing sugar used as a reserve carbohydrate and stress protectant in a variety of organisms. While higher plants typically do not accumulate high levels of trehalose, they encode large families of putative trehalose biosynthesis genes. Trehalose biosynthesis in plants involves a two-step reaction in which trehalose-6-phosphate (T6P) is synthesized from UDP-glucose and glucose-6-phosphate (catalyzed by T6P synthase [TPS]), and subsequently dephosphorylated to produce the disaccharide trehalose (catalyzed by T6P phosphatase [TPP]). In Arabidopsis (Arabidopsis thaliana), 11 genes encode proteins with both TPS- and TPP-like domains but only one of these (AtTPS1) appears to be an active (TPS) enzyme. In addition, plants contain a large family of smaller proteins with a conserved TPP domain. Here, we present an in-depth analysis of the 10 TPP genes and gene products in Arabidopsis (TPPA-TPPJ). Collinearity analysis revealed that all of these genes originate from whole-genome duplication events. Heterologous expression in yeast (Saccharomyces cerevisiae) showed that all encode active TPP enzymes with an essential role for some conserved residues in the catalytic domain. These results suggest that the TPP genes function in the regulation of T6P levels, with T6P emerging as a novel key regulator of growth and development in higher plants. Extensive gene expression analyses using a complete set of promoter-beta-glucuronidase/green fluorescent protein reporter lines further uncovered cell- and tissue-specific expression patterns, conferring spatiotemporal control of trehalose metabolism. Consistently, phenotypic characterization of knockdown and overexpression lines of a single TPP, AtTPPG, points to unique properties of individual TPPs in Arabidopsis, and underlines the intimate connection between trehalose metabolism and abscisic acid signaling.